ABSTRACT

We present estimates of cosmological parameters from the application of the Karhunen-Loeve transform
to the analysis of the 3D power spectrum of density fluctuations using Sloan Digital Sky Survey galaxy
redshifts. We use $\Omega_m h$ and $f_b = \Omega_b/\Omega_m$ to describe the shape of the power spectrum, $\sigma_{8g}^L$ for the (linearly
extrapolated) normalization, and $\beta$ to parametrize linear theory redshift space distortions. On scales
$k < 0.16\,h$ Mpc$^{-1}$, our maximum likelihood values are $\Omega_m h = 0.264 \pm 0.043$, $f_b = 0.286 \pm 0.065$,
$\sigma_{8g}^L = 0.966 \pm 0.048$, and $\beta = 0.45 \pm 0.12$. When we take a prior on $\Omega_b$ from WMAP, we find $\Omega_m h = 0.207 \pm 0.030$,
which is in excellent agreement with WMAP and 2dF. This indicates that we have reasonably measured
the gross shape of the power spectrum but we have difficulty breaking the degeneracy between $\Omega_m h$ and
$f_b$ because the baryon oscillations are not resolved in the current spectroscopic survey window function.






Subject headings: cosmology: theory — galaxies: distances and redshifts — large-scale structure of the 
universe — methods: statistical 

1. INTRODUCTION 



Redshift surveys are an extremely useful tool to study the large scale distribution of galaxies. Of the many possible 
statistical estimators the power spectrum of the density fluctuations has emerged as one of the easiest to connect to 
theories of structure formation in the Universe, especially in the limit of Gaussian fluctuations where the power spectrum
is the complete statistical description. There are several ways to measure the power spectrum (for a comparison of
techniques see Tegmark et al. 1998). Over the last few years, the Karhunen-Loeve method (Vogeley & Szalay 1996,
hereafter VS96) has been recognized as the optimal way to build an orthogonal basis set for likelihood analysis, even if
the underlying survey has a very irregular footprint on the sky. A variant of the same technique is used for the analysis
of CMB fluctuations (Bond, Jaffe, & Knox 2000).

The shape of the power spectrum is well described by a small set of parameters (Eisenstein & Hu 1998). For redshift
surveys, it is of particular importance to consider the large-scale anisotropies caused by infall (Kaiser 1987). Using a
forward technique that compares models directly to the data, like the KL-transform, enables us to easily consider these
anisotropies in full detail. Here we present results of a parametric analysis of the shape of the fluctuation spectrum for
the SDSS galaxy catalog.

The Sloan Digital Sky Survey (SDSS; York et al. 2000; Stoughton et al. 2002) plans to map nearly one quarter of the
sky using a dedicated 2.5 meter telescope at Apache Point Observatory in New Mexico. A drift-scanning CCD camera
(Gunn et al. 1998) is used to image the sky with a custom set of 5 filters ($ugriz$) (Fukugita et al. 1996; Smith et al. 2002) to
a limiting Petrosian (1976) magnitude of $m_r \sim 22.5$. Observations are calibrated using a 0.5 meter photometric telescope
(Hogg, Finkbeiner, Schlegel, & Gunn 2001). After a stripe of sky has been imaged, reduced, and astrometrically calibrated
(Pier et al. 2003), additional automated software selects potential targets for spectroscopy. These targets are assigned to
3° diameter (possibly overlapping) circles on the sky called tiles (Blanton et al. 2003a). Aluminum plates drilled from the
tile patterns hold optical fibers that feed into the SDSS spectrographs (Uomoto et al. 1999). The SDSS Main Galaxy
Sample (MGS; Strauss et al. 2002) will consist of spectra of nearly one million low redshift ($\langle z \rangle \sim 0.1$) galaxies creating
a three dimensional map of local large scale structure. 

2.2. Large Scale Structure Sample 

Considerable effort has been invested in preparing SDSS MGS redshift data for large scale structure studies. The first 
task is to correct for fiber collisions. The minimum separation between optical fibers is 55" which causes a correlated loss 
of redshifts in areas covered by a single plate. Galaxy targets that were not observed due to collisions are assigned the 
redshift of their nearest neighbor. Next the sky is divided into unique regions of overlapping spectroscopic plates called 
sectors. The angular completeness is calculated for each sector as if the collided galaxies had been successfully measured. 
Galaxy magnitudes are extinction-corrected with the Schlegel, Finkbeiner, & Davis (1998) dust maps, then k-corrections
are applied and rest frame colors and luminosities are calculated (Blanton et al. 2003b). Subsamples are created by
making appropriate cuts in luminosity, color, and/or flux. A luminosity function is then calculated for each subsample
(Blanton et al. 2003c) and used to create a radial selection function assuming an $\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$ cosmology.

This analysis considers two samples of SDSS data, which we will label sample 10 and sample 12. Both samples were 
prepared in similar manners, although using different versions of software. Sample 12 represents a later state of the survey 
and the sample 10 area is contained in sample 12. Sample 10 represents 1983.39 completeness-weighted square degrees of 
spectroscopically observed SDSS data and 165,812 MGS redshifts. Sample 12 has 205,484 redshifts over 2406.74 square 
degrees. Both samples are larger than the 1360 square degrees of spectroscopy in data release 1 (DR1; Abazajian et al.
2003) of the SDSS. The geometry of the samples and DR1 are qualitatively similar, consisting of two thick slices in the
northern cap of the survey and three thin stripes in the south. The samples used have a luminosity cut of $-19 > M_r > -22$,
where $h = 1.0$ and $M_* = -20.44$ (Blanton et al. 2003c). Rest frame quantities (i.e., absolute magnitudes) are given for the
SDSS filters at $z = 0.1$, the median depth of the MGS. In a study of the two point correlation function of SDSS galaxy
redshifts, Zehavi et al. (2002) found that the bias relative to $M_*$ galaxies varies from 0.8 for galaxies with $M = M_* + 1.5$
to 1.2 for galaxies with $M = M_* - 1.5$. Norberg et al. (2001) found similar results for the 2dF, with the trend becoming
more pronounced at luminosities significantly greater than $L_*$. The dependence of clustering strength on luminosity
could induce an extra tilt in the power spectrum because more luminous galaxies contribute more at large scales and less
luminous galaxies contribute more at small scales due to the number of available baselines. We minimize this effect by
staying within $M = M_* \pm 1.5$. A uniform flux limit of $r < 17.5$ was applied, leaving 110,345 redshifts for sample 10
and 134,141 for sample 12. Although there are luminosity limits for this sample, it is essentially a flux limited sample 
with a (slowly) varying selection function. We used galaxies in the redshift range 0.05 < z < 0.17. 

3. ALGORITHM 

3.1. The Karhunen-Loeve Eigenbasis 

Following the strategy described in VS96, the first step in a Karhunen-Loeve (KL) eigenmode analysis of a redshift 
survey is to divide the survey volume into cells and use the vector of galaxy counts within the cells as our data. This 
allows a large compression in the size of the dataset without a loss of information on large scales. Our data vector of 
fluctuations d is defined as 

$$d_i = c_i/n_i - 1 \tag{1}$$

where $c_i$ is the observed number of galaxies in the $i$th cell and $n_i = \langle c_i \rangle$ is its expected value, calculated from the angular
completeness and radial selection function. The data are "whitened" by the factor $1/n_i$ to control shot noise properties in
the transform (VS96). We call this the "overdensity" convention.
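
As an illustration of Eq. (1), the following minimal sketch builds the overdensity vector from cell counts. The function names and the toy numbers are hypothetical and are not taken from the survey pipeline; the second helper simply evaluates the Poisson variance $1/n_i$ of $d_i$ for a fully sampled cell.

```python
def overdensity(counts, expected):
    """Return d_i = c_i / n_i - 1 for each cell (Eq. 1)."""
    if len(counts) != len(expected):
        raise ValueError("counts and expected must have the same length")
    return [c / n - 1.0 for c, n in zip(counts, expected)]


def shot_noise_variance(expected):
    """Poisson variance of d_i for each cell, i.e. 1 / n_i."""
    return [1.0 / n for n in expected]


# Hypothetical toy numbers, not survey data.
counts = [12, 7, 10, 0]
expected = [10.0, 10.0, 8.0, 5.0]
d = overdensity(counts, expected)
noise = shot_noise_variance(expected)
```

An empty cell gives $d_i = -1$, the lower bound of the overdensity, and cells with small expected counts carry the largest shot noise.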

The KL modes are the solutions to the eigenvalue problem $R \Psi_n = \lambda_n \Psi_n$ with the correlation matrix of the data given
by

$$R_{ij} = \langle d_i d_j \rangle = \xi_{ij} + \delta_{ij}/n_i + \eta_{ij}/(n_i n_j) \tag{2}$$

where $\xi_{ij}$ is the cell-averaged correlation matrix, $\delta_{ij}/n_i$ is the shot noise term, and $\eta_{ij}/(n_i n_j)$ can be used to account for
correlated noise (not used in this analysis). The most obvious source of correlated noise in the MGS would be differences 
in photometric zero points between different SDSS imaging runs, which would result in "zebra stripe" patterns of density 
fluctuations. The MGS selection has a magnitude limit, but no color selection terms, so the variation in target density 
depends only linearly on the photometric calibration. The r band zero point variation is 0.02 mag rms (Abazajian et al. 
2003), indicating that the density variation should be < 2%. The transformed data vector B is the expansion of d over 
the KL modes 
$$d = \sum_n B_n \Psi_n. \tag{3}$$

The KL basis is defined by two properties: orthonormality of the basis vectors, $\Psi_m \cdot \Psi_n = \delta_{mn}$, and statistical
orthogonality of the transformed data, $\langle B_m B_n \rangle = \langle B_n^2 \rangle \delta_{mn}$.
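
The construction of the KL basis can be sketched on a toy problem. The example below is only an illustration under stated assumptions: the three-cell correlation matrix ($\xi_{ij} = e^{-|i-j|}$ plus shot noise) is invented, and the eigenproblem is solved with a textbook cyclic Jacobi iteration in pure Python, which is a stand-in for whatever dense eigensolver a real analysis with thousands of cells would use.

```python
import math


def jacobi_eigh(matrix, max_sweeps=50, tol=1e-24):
    """Eigen-decompose a small symmetric matrix with cyclic Jacobi rotations.

    Returns (eigenvalues, modes) sorted by decreasing eigenvalue, where
    modes[n] is the n-th eigenvector as a list of components.
    """
    n = len(matrix)
    a = [list(map(float, row)) for row in matrix]          # working copy
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation that zeroes a[p][q] (small-angle branch).
                tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.sqrt(1.0 + tau * tau))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = t * c
                for k in range(n):                         # A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):                         # A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):                         # V <- V J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    order = sorted(range(n), key=lambda k: -a[k][k])
    return [a[k][k] for k in order], [[v[i][k] for i in range(n)] for k in order]


# Hypothetical 3-cell toy model: xi_ij = exp(-|i - j|) plus shot noise 1/n_i.
n_expected = [10.0, 10.0, 5.0]
R = [[math.exp(-abs(i - j)) + (1.0 / n_expected[i] if i == j else 0.0)
      for j in range(3)] for i in range(3)]
eigenvalues, modes = jacobi_eigh(R)

d = [0.2, -0.3, 0.25]                                      # toy overdensities
B = [sum(p_i * d_i for p_i, d_i in zip(psi, d)) for psi in modes]   # B_n = Psi_n . d
d_rebuilt = [sum(B[n] * modes[n][i] for n in range(3)) for i in range(3)]
```

Using all modes, the expansion of Eq. (3) reproduces the data vector to rounding error, which is the sense in which the transform loses no information.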



3.2. The Correlation Function in Redshift Space 

In order to directly compare cosmological models to our redshift data using a two point statistic we must calculate the
redshift space correlation function $\xi^{(s)}(\mathbf{r}_i, \mathbf{r}_j)$, where $\mathbf{r}_i$ and $\mathbf{r}_j$ describe positions in the observable angles and redshift.
The infall onto large scale structures affects the velocities of galaxies leading to an anisotropy in redshift space for a power 
spectrum that is isotropic in real space (Kaiser 1987). Szalay, Matsubara, & Landy (1998) derived an expansion of the 
correlation function that accounts for this anisotropy in linear theory for arbitrary angles. The expansion is 



$$\xi^{(s)}(\mathbf{r}_i, \mathbf{r}_j) = c_{00}\,\xi_0^{(0)} + c_{02}\,\xi_2^{(0)} + c_{04}\,\xi_4^{(0)} + \cdots \tag{4}$$

$$\xi_L^{(n)} = \frac{1}{2\pi^2} \int dk\, k^2\, k^{-n}\, j_L(kr)\, P(k) \tag{5}$$

where the $c_{nL}$ coefficients are polynomials of $\beta$ and functions of the relative geometry of the two points. The quantity
$\beta$ relates infall velocity to matter density and is well approximated by the fitting formula $\beta = \Omega_m^{0.6}/b$ where $b$ is the bias
parameter. Further terms in Eq. (4) are negligible as long as $2 + \partial \ln \phi(r) / \partial \ln r$ (where $r$ is the distance to the cell and
$\phi(r)$ is the radial selection function) does not significantly differ (i.e., by orders of magnitude) from unity. For the redshift
range considered in this analysis $|2 + \partial \ln \phi(r) / \partial \ln r| < 4$. When using counts-in-cells, we must calculate the cell-averaged
correlation matrix 

$$\xi_{ij} = \int d^3r_1 \int d^3r_2\, \xi^{(s)}(\mathbf{r}_1, \mathbf{r}_2)\, W_i(\mathbf{x}_i - \mathbf{r}_1)\, W_j(\mathbf{x}_j - \mathbf{r}_2) \tag{6}$$

where $W_i(\mathbf{y})$ is the cell window function and $\mathbf{x}_i$ is the position of the $i$th cell. To be precise, $W_i(\mathbf{y})$ should describe
the shape of the cell in redshift space. Numerical calculation of this multi-dimensional integral can be computationally
expensive. However, for the case of spherically symmetric cells we can change the order of integration and perform the
redshift space integrals in Eq. (6) analytically before the $k$-space integral in Eq. (5). If both cells have the same window
function, we can use Eq. (4) as our cell-averaged correlation function (with $\mathbf{r}_i$ and $\mathbf{r}_j$ indicating the cell positions) if we
replace $P(k)$ with $P(k) W^2(k)$ in Eq. (5) where $W(k)$ is the Fourier transform of the cell window function. This results
in a one dimensional numerical integral. The full technical details of our method will be presented in Matsubara et al. 
(2004, in preparation). 
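
The one dimensional integral can be illustrated for the lowest-order term, $n = 0$ and $L = 0$. The sketch below is an assumption-laden toy rather than the method of the paper: it uses Simpson's rule on a uniform grid, a spherical top-hat window for the cell, and a Gaussian $P(k)$ chosen only because its correlation function is known in closed form; a real analysis would substitute the model power spectrum and the remaining $\xi_L^{(n)}$ terms.

```python
import math


def j0(x):
    """Spherical Bessel function j_0(x) = sin(x) / x."""
    return 1.0 if abs(x) < 1e-8 else math.sin(x) / x


def tophat_window(k, radius):
    """Fourier transform of a uniform sphere, normalised so that W(0) = 1."""
    x = k * radius
    if abs(x) < 1e-3:
        return 1.0 - x * x / 10.0                 # series expansion near zero
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3


def toy_power(k, a=10.0):
    """Toy Gaussian spectrum (NOT a cosmological model), used as a test case."""
    return math.exp(-(a * k) ** 2)


def xi00(r, power, radius=0.0, k_max=2.0, n_steps=20000):
    """xi_0^(0)(r) = 1/(2 pi^2) int dk k^2 j_0(kr) P(k) W^2(k) by Simpson's rule."""
    if n_steps % 2:
        n_steps += 1                               # Simpson needs an even count
    h = k_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        k = i * h
        w = tophat_window(k, radius) if radius > 0.0 else 1.0
        f = k * k * j0(k * r) * power(k) * w * w
        weight = 1 if i in (0, n_steps) else (4 if i % 2 else 2)
        total += weight * f
    return total * h / 3.0 / (2.0 * math.pi ** 2)


# Closed form for the toy spectrum without smoothing:
# xi(r) = exp(-r^2 / (4 a^2)) / (8 pi^(3/2) a^3), here with a = 10.
exact = math.exp(-20.0 ** 2 / 400.0) / (8.0 * math.pi ** 1.5 * 1000.0)
numeric = xi00(20.0, toy_power)
smoothed = xi00(20.0, toy_power, radius=6.0)
```

Setting `radius` to the cell radius applies the $W^2(k)$ factor, which suppresses power on scales smaller than the cell.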

We used hard spheres as our cell shape and placed them in a hexagonal closest packed (the most efficient 3D packing, 
with a 74% space-filling factor) arrangement. The current slice-like survey geometry and packing arrangement cause
some spheres to partially protrude outside the survey. The effective fraction of the sphere that is sampled is also affected
by the angular completeness of our survey (which averages ~ 97%). We calculate our expected counts as if the sphere
were entirely filled and multiply the observed galaxy counts by $1/f_i$ where $f_i$ is the fraction of the $i$th sphere's volume
that was effectively sampled. This sparser sampling also increases the shot noise by a factor of $1/f_i$. Cells with $f_i < 0.65$
were rejected as too incomplete. We found that a $6\,h^{-1}$ Mpc sphere radius allowed us to fill the survey volume with a
computationally feasible number of cells without the spheres protruding too much out of the survey, while smoothing on
sufficiently small length scales so that we do not lose information in the linear regime ($2\pi/k > 40\,h^{-1}$ Mpc). We used
14,194 cells for sample 10 and 16,924 for sample 12. 
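
The incompleteness correction can be sketched as follows. The function name and the toy inputs are hypothetical. The shot noise factor returned for each cell follows from a simple argument, assuming Poisson counts: if the observed counts have mean $f_i n_i$, then after scaling by $1/f_i$ the overdensity has variance $1/(f_i n_i)$, which is $1/f_i$ times the value for a fully sampled cell.

```python
import math

# Filling factor of hexagonal closest packing, pi / (3 sqrt(2)) ~ 0.74.
HCP_FILLING = math.pi / (3.0 * math.sqrt(2.0))


def corrected_cells(counts, fractions, f_min=0.65):
    """Scale observed counts by 1/f_i and drop cells with f_i < f_min.

    Returns a list of (cell_index, corrected_count, shot_noise_factor).
    """
    kept = []
    for i, (c, f) in enumerate(zip(counts, fractions)):
        if f < f_min:
            continue                      # too incomplete, reject the cell
        kept.append((i, c / f, 1.0 / f))
    return kept


# Hypothetical toy cells: the second one is only half sampled and is rejected.
cells = corrected_cells([8, 3, 10], [0.8, 0.5, 1.0])
```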

The calculation of the sampling fraction for each cell is difficult due to the complicated shapes of the sectors (§2.2). We 
created a high resolution angular completeness map in a SQL Server database using 10^ random angular points over the 
entire sky. Each point was assigned a completeness weighting by finding which sector contained the point or setting the 
completeness to zero for points outside the survey area. We used a Hierarchical Triangular Mesh (HTM; Kunszt, Szalay, 
& Thakar 2001) spatial indexing scheme to find all points in the completeness map that pierce a cell and calculate the 
volume weighted completeness for that cell. 



3.3. Eigenmode Selection

The KL transform is linear, so there is no loss of information if we use all of the eigenmodes. However, if we perform 
a truncated expansion we can use the KL transform for compression and filtering. The difference between the original 
data vector and a truncated reconstruction, $\hat{d} = \sum_{i=1}^{M} B_i \Psi_i$, where we use only $M$ out of a possible $N$ modes, can be
related to the eigenvalues of the excluded modes by $\langle (d - \hat{d})^2 \rangle = \sum_{i=M+1}^{N} \lambda_i$. The error is minimized (in a squared sense)
when we retain modes with larger eigenvalues and drop modes with smaller eigenvalues, which is sometimes called optimal
subspace filtering (Therrien 1992).
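
The truncation error follows in two steps from the defining properties of the KL basis (§3.1). The derivation below is a sketch that assumes only the orthonormality of the modes and the statistical orthogonality of the coefficients, with $\langle B_i^2 \rangle = \lambda_i$ because $\langle B_i B_j \rangle = \Psi_i^T R\, \Psi_j = \lambda_j \delta_{ij}$.

```latex
\begin{aligned}
d - \hat{d} &= \sum_{i=M+1}^{N} B_i \Psi_i, \\
\left\langle (d - \hat{d})^2 \right\rangle
  &= \sum_{i=M+1}^{N} \sum_{j=M+1}^{N} \langle B_i B_j \rangle \, \Psi_i \cdot \Psi_j
   = \sum_{i=M+1}^{N} \langle B_i^2 \rangle
   = \sum_{i=M+1}^{N} \lambda_i .
\end{aligned}
```

Since every $\lambda_i$ is non-negative, dropping the modes with the smallest eigenvalues gives the smallest expected squared error for a fixed $M$.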

The eigenvalue of a KL mode is also related to the range in $k$-space sampled by that mode. Our models assume
that linear theory is a good approximation, which is only valid on larger scales. Consequently we only wish to use KL
modes that fall inside a "Fermi sphere" whose radius is set by our cutoff wavenumber $k_f$. If we sort modes by decreasing
eigenvalue, they will densely pack $k$-space starting from the origin. The modes resist overlapping in $k$-space due to
orthogonality. The shape of a KL mode in $k$-space resembles the Fourier transform of the survey window function. This
means that the number of KL modes within the "Fermi sphere" depends mostly on the survey window function and does
not drastically change if we change the size of our cells, as long as we have significantly more cells than modes (which
means that our cells must be smaller than the cutoff wavelength). In a fully three dimensional survey the modes would
fill $k$-space roughly spherically and $M \propto k_f^3$. However, the current SDSS geometry resembles several two dimensional
slices, resulting in KL modes that resemble cigars in $k$-space. These modes pack layer-by-layer into spherical shells whose
diameters are integer multiples of the long axis of the mode. See Fig. 5 in Szalay et al. (2003) for a visualization. This
results in a scaling more like $M \propto k_f^2$.

In choosing the number of KL modes to use in our analysis we try to keep as many modes as possible for better 
constraints on our parameter values while requiring that our modes are consistent with linear theory. We have developed 
a convenient method for determining the range in $k$-space probed by each KL mode. We separate the integral in Eq. (5)
into bandpowers in $k$. This allows us to determine how strongly each mode couples to each bandpower, which shows a
coarse picture of the spherically averaged position of the mode in $k$-space. Fig. 1 shows a grayscale image of how the
modes couple to the bandpowers. Once we choose a value for the cutoff wavenumber $k_f$, we truncate our expansion at
the mode where wavenumbers larger than $k_f$ start to dominate.

We can use the statistical properties of the transformed data to check that we are avoiding non-linearities. A rescaled
version of the KL coefficients $b_n = B_n/\sqrt{\lambda_n}$ should be normally distributed. Non-linear effects would cause skewness
(third moment) and/or kurtosis (fourth moment) in the distribution of $b_n$. We do not see evidence of non-linear effects
when we use $k_f < 0.16\,h$ Mpc$^{-1}$ (corresponding to length scales $2\pi/k_f > 40\,h^{-1}$ Mpc). This value for the cutoff wavenumber
leaves us with 1500 modes for sample 10 and 1850 modes for sample 12. 
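
A check of this kind can be sketched in a few lines. The example is illustrative only: the coefficients are synthetic Gaussian draws rather than survey data, the eigenvalue spectrum is invented, and the quoted scatter of the sample skewness and kurtosis, $\sqrt{6/N}$ and $\sqrt{24/N}$, is the standard large-$N$ approximation for Gaussian samples.

```python
import math
import random


def rescaled_moments(coefficients, eigenvalues):
    """Variance, skewness and excess kurtosis of b_n = B_n / sqrt(lambda_n)."""
    b = [x / math.sqrt(lam) for x, lam in zip(coefficients, eigenvalues)]
    n = len(b)
    mean = sum(b) / n
    m2 = sum((x - mean) ** 2 for x in b) / n
    m3 = sum((x - mean) ** 3 for x in b) / n
    m4 = sum((x - mean) ** 4 for x in b) / n
    return m2, m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Synthetic Gaussian coefficients (not survey data): B_n ~ N(0, lambda_n).
rng = random.Random(42)
eigenvalues = [1.0 / (1 + n) for n in range(1500)]
B = [rng.gauss(0.0, math.sqrt(lam)) for lam in eigenvalues]
variance, skewness, excess_kurtosis = rescaled_moments(B, eigenvalues)

# Approximate sampling scatter of the two statistics for Gaussian data.
skew_sigma = math.sqrt(6.0 / len(B))
kurt_sigma = math.sqrt(24.0 / len(B))
```

With 1500 modes the expected scatter is about 0.06 in skewness and 0.13 in excess kurtosis, so only departures well beyond those values would point to non-Gaussian coefficients.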

3.4. Model Testing 

We estimate cosmological parameters by performing maximum likelihood analysis in KL space. The likelihood of the 
observed data given a model m is 

$$\mathcal{L}(B|m) = (2\pi)^{-M/2}\, |C_m|^{-1/2} \exp\left[-\tfrac{1}{2} B^T C_m^{-1} B\right] \tag{7}$$
where $C_m$ is the covariance matrix and can be calculated as the projected model correlation matrix,

$$(C_m)_{ij} = \langle B_i B_j \rangle_m = \Psi_i^T R_m \Psi_j. \tag{8}$$

Our method is based upon a linear comparison of models to data, thus the $R_m$ (and $C_m$) model matrices only contain
second moments of the density field. This linear estimator is computationally more expensive than quadratic or higher 
order estimators, but the results are less sensitive to non-linearities. For a comparison of different estimation methods, 
see Tegmark et al. (1998). 
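
For reference, the logarithm of the Gaussian likelihood can be evaluated stably through a Cholesky factorization of the model covariance matrix. The sketch below is a generic pure-Python implementation with hypothetical function names; it is not the code used for this analysis, and for thousands of modes an optimized linear algebra library would be the practical choice.

```python
import math


def cholesky(c):
    """Lower-triangular factor L with L L^T = C for a positive definite C."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = c[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def log_likelihood(B, c):
    """ln L = -(M/2) ln(2 pi) - (1/2) ln|C| - (1/2) B^T C^-1 B."""
    m = len(B)
    low = cholesky(c)
    # Forward substitution solves L y = B, so that B^T C^-1 B = y . y
    y = []
    for i in range(m):
        y.append((B[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(m))
    chi2 = sum(v * v for v in y)
    return -0.5 * (m * math.log(2.0 * math.pi) + log_det + chi2)


# Toy check with a 2 x 2 covariance matrix (|C| = 3, B^T C^-1 B = 2/3).
value = log_likelihood([1.0, 0.0], [[2.0, 1.0], [1.0, 2.0]])
```

Working with $\ln \mathcal{L}$ avoids the underflow that the determinant and exponential would cause for large $M$.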

In practice we must decide on an explicit parametrization. We construct a power spectrum assuming a primordial 
spectrum of fluctuations with a spectral index $n_s = 1$. We use a fitting formula from Eisenstein & Hu (1998) to
characterize the transfer function, including the baryon oscillations. We fit for $\Omega_m h$ and $f_b = \Omega_b/\Omega_m$ while taking a
prior of $H_0 = 72 \pm 8$ km s$^{-1}$ Mpc$^{-1}$ from the Hubble key project (Freedman et al. 2001) and fixing $T_{\rm CMB} = 2.728$ K (Fixsen
et al. 1996). We fit the linearly extrapolated $\sigma_{8g}^L$ for normalization, where $\sigma_{8g}^L = b\,\sigma_{8m}$ and $b$ is the bias. Linear theory
redshift-space distortions are characterized by $\beta$ (see §3.2).

In order to search an appreciable portion of parameter space we have developed efficient methods to calculate the
model covariance matrices $C_m$. The straightforward approach would be to calculate the model correlation matrix for a
set of parameters and then project into the KL basis and calculate a likelihood, but this is computationally expensive.
The covariance matrix can easily be written as a linear combination of matrices and powers of $\sigma_{8g}^L$ and $\beta$ (see §3.2), so we
can project pieces of the correlation matrix and add them in the appropriate proportions for those parameters. However,
the shape of the power spectrum depends on $\Omega_m$, $f_b$, and $H_0$ in a non-trivial way. We project each bandpower of the
correlation matrix (see §3.3) separately and add the pieces of the covariance matrix together with appropriate weighting 
to represent different power spectrum shapes. This alleviates the need for further projections. We must be careful when 
choosing our bandpowers so that we retain sufficient resolution to accurately mimic power spectrum shapes (especially 
baryon oscillations), but we must also be careful that our k ranges are large enough that the integrals converge correctly. 
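
The bandpower bookkeeping can be sketched as a weighted sum of precomputed matrices. Everything named here is hypothetical: `band_matrices` stands for the projected pieces of the covariance matrix for unit power in each band, `band_powers` for the model power in those bands, and `amplitude` for an overall normalization factor. The sketch omits the separate dependence on powers of $\beta$ and the shot noise term, which would be added in the same linear fashion.

```python
def combine_bandpowers(band_matrices, band_powers, amplitude=1.0):
    """Return amplitude * sum_b P_b * C_b for precomputed projected matrices C_b."""
    if len(band_matrices) != len(band_powers):
        raise ValueError("need one power value per band matrix")
    n = len(band_matrices[0])
    total = [[0.0] * n for _ in range(n)]
    for c_b, p_b in zip(band_matrices, band_powers):
        for i in range(n):
            for j in range(n):
                total[i][j] += amplitude * p_b * c_b[i][j]
    return total


# Toy example with two bands and two modes (invented numbers).
band_matrices = [
    [[1.0, 0.0], [0.0, 0.5]],
    [[0.2, 0.1], [0.1, 0.4]],
]
covariance = combine_bandpowers(band_matrices, band_powers=[2.0, 10.0])
```

Because the expensive projection into the KL basis is done once per band, a new power spectrum shape costs only this sum.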

Note that a non-optimal choice of fiducial parameters does not bias our results, but it can result in non-minimal error 
bars. This procedure can be iterated if necessary. 

4. RESULTS AND DISCUSSION 

Our best-fit maximum- likelihood parameter values for samples 10 and 12 are presented in Table 1. Results are given 
for the priors described in §3.4 and also when using the additional prior $\Omega_b = 0.047 \pm 0.006$ from WMAP (Spergel et
al. 2003). We show the results of sample 10 and 12 to give some indication of sample variance, although sample 10 is a 
subset of sample 12. 
The middle column of Fig. 2 shows the marginalized one-dimensional and two-dimensional confidence regions for the
power spectrum shape parameters $\Omega_m h$ and $f_b$ for sample 10 without the additional prior on $\Omega_b$. There is a strong
correlation between $\Omega_m h$ and $f_b$. The gross shape of the power spectrum (i.e., ignoring the baryon oscillations and concentrating
on the position of the peak and slope of the tail) is nearly constant along the ridge of this correlation due to a degeneracy
between shifting the position of the peak with $\Omega_m h$ and adding power to the peak with $f_b$. However, the strength of
the baryon oscillations varies significantly over this range. Table 1 shows that our estimates of $\Omega_m h$ agree well with the
WMAP value of 0.194 ± 0.04 (Spergel et al. 2003) and the 2dF value of 0.20 ± 0.03 (Percival et al. 2001) when we use
the additional prior on $\Omega_b$, and the associated confidence regions are shown in the left column of Fig. 2. The results
with the $\Omega_b$ prior indicate that the gross shape of the power spectrum we measure is consistent with WMAP and 2dF,
as can be seen in Fig. 3 which shows the (isotropic) real-space power spectra inferred from the cosmological parameter
estimates from the three surveys. However, the results without the $\Omega_b$ prior show that we have difficulty breaking the
degeneracy between $\Omega_m h$ and $f_b$ because the baryon oscillations are not resolved due to the current state of the SDSS
window function.

The right column of Fig. 2 shows the marginalized one-dimensional and two-dimensional confidence regions for $\sigma_{8g}^L$
(normalization) and $\beta$ (distortions) for sample 10. Again there is a strong correlation between these parameters, which is
expected from their dependence on $b$. Our constraint on $\sigma_{8g}^L$ is strong, but we can only measure $\beta$ to $\sim 20\%$ which limits
our ability to perform an independent estimate of $b$. We can compare our results to WMAP by examining the combination
of parameters $\sigma_{8g}^L \beta = \sigma_{8m} \Omega_m^{0.6}$, for which we obtain the value 0.44 ± 0.12, in excellent agreement with the WMAP result
of 0.44 ± 0.10 (Spergel et al. 2003). By combining our measurements with WMAP results we find $b = 1.07 \pm 0.13$ for
our galaxy sample, but this compares information dominated by galaxies with redshifts $0.1 < z < 0.15$ to present-day
matter. If we use a $\Lambda$CDM model to extrapolate to the present, we would find $b \approx 1.16$. Our galaxies cover a range of
luminosities but our signal is dominated by the more luminous galaxies (brighter than $L_*$) because there are more long
baselines available for the more distant galaxies. This must be kept in mind when comparing our measurement of $\sigma_{8g}^L$
with other estimates using SDSS data which focus on $L_*$ galaxies (Szalay et al. 2003; Tegmark et al. 2003).
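
A rough consistency check of these combinations can be made from the sample 10 values in Table 1. The sketch below is a back-of-the-envelope estimate, not the marginalized likelihood calculation behind the quoted numbers: it adds fractional errors in quadrature as if the parameters were uncorrelated (they are not), and the WMAP value $\sigma_8 = 0.9 \pm 0.1$ used for the bias is an assumed input that is not quoted in this text.

```python
import math


def product_with_error(x, sx, y, sy):
    """x * y with fractional errors added in quadrature (assumes no correlation)."""
    value = x * y
    return value, value * math.hypot(sx / x, sy / y)


def ratio_with_error(x, sx, y, sy):
    """x / y with fractional errors added in quadrature (assumes no correlation)."""
    value = x / y
    return value, value * math.hypot(sx / x, sy / y)


sigma8g = (0.966, 0.048)      # sample 10, Table 1
beta = (0.45, 0.12)           # sample 10, Table 1
combo = product_with_error(*sigma8g, *beta)

# ASSUMPTION: WMAP matter normalization sigma_8 = 0.9 +/- 0.1 (not from this text).
bias = ratio_with_error(*sigma8g, 0.9, 0.1)
```

The product comes out near 0.43 with an uncertainty near 0.12, close to the quoted 0.44 ± 0.12, and the ratio lands near 1.07 ± 0.13 under the assumed WMAP input.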

This analysis used less than one third of the data that will comprise the completed SDSS survey. Our ability to measure 
cosmological parameters will increase as the survey area increases, but we should also gain leverage in resolving features 
in the power spectrum as our survey window function becomes cleaner. The thickest slice of data from the samples used 
was roughly 10°, implying a thickness of $\sim 50\,h^{-1}$ Mpc at $z \sim 0.1$. As the slices become thicker, the KL modes will become
much more compact in that direction in $k$-space. Thus we will benefit from the change in the survey aspect ratio in
addition to the increase in survey area. 
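
The quoted slice thickness can be reproduced with a short estimate. This sketch assumes the low-redshift approximation $r \approx cz/H_0$ with $c/H_0 = 2997.9\,h^{-1}$ Mpc, which is adequate at $z \sim 0.1$ for an order-of-magnitude check; the function name is hypothetical.

```python
import math


def slice_thickness(z, opening_angle_deg, c_over_h0=2997.92458):
    """Transverse thickness in h^-1 Mpc, using the low-z distance r ~ cz/H0."""
    distance = c_over_h0 * z
    return distance * math.radians(opening_angle_deg)


thickness = slice_thickness(0.1, 10.0)          # roughly 52 h^-1 Mpc
k_width = 2.0 * math.pi / thickness             # roughly 0.12 h Mpc^-1
```

The corresponding wavenumber, $2\pi$ divided by the thickness, is comparable to the cutoff wavenumber used in this analysis, which gives a sense of how extended the modes are in $k$-space along the thin direction of a slice.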
Table 1
Best Fit Parameter Values

Parameter          10               10 + $\Omega_b$    12               12 + $\Omega_b$
$\Omega_m h$       0.264 ± 0.043    0.207 ± 0.030      0.270 ± 0.057    0.229 ± 0.029
$f_b$              0.286 ± 0.065    0.163 ± 0.031      0.233 ± 0.088    0.149 ± 0.026
$\sigma_{8g}^L$    0.966 ± 0.048    0.971 ± 0.049      0.978 ± 0.043    0.980 ± 0.043
$\beta$            0.45 ± 0.12      0.44 ± 0.12        0.44 ± 0.11      0.43 ± 0.11

Note. Maximum likelihood parameter values and 68% confidences (marginalized over all other parameters). Column headings
give the sample number; $\Omega_b$ indicates that a WMAP prior was used.

Fig. 1. Grayscale image of wave number vs. mode number. The horizontal red line indicates $k_f = 0.16\,h$ Mpc$^{-1}$. The vertical black line
indicates the truncated number of modes used for likelihood analysis.

Fig. 2. Likelihoods for parameters using sample 10. The left column shows the power spectrum shape parameters with an $\Omega_b$ prior.
The middle column shows the power spectrum shape parameters without an $\Omega_b$ prior. The right column shows normalization and distortion
parameters. The contours in the joint parameter plots are the two-dimensional 1, 2, and 3 $\sigma$ contours. The points in the $f_b$ vs. $\Omega_m h$ plots
are MCMC points from WMAP (alone). Parameter combinations not plotted are nearly uncorrelated.

Fig. 3. Plots of the real space $P(k)$ against $k$ ($h$ Mpc$^{-1}$) from best-fit model parameters for SDSS (sample 10 with and without the $\Omega_b$ prior), WMAP, and
2dF. All use $\sigma_{8g}^L$ from the SDSS for normalization. The vertical dotted lines indicate the range in $k$ used in the SDSS analysis.



